Reverberant speech recognition based on denoising autoencoder

Authors

Takaaki Ishii

Hiroki Komiyama

Takahiro Shinozaki

Yasuo Horiuchi

Shingo Kuroiwa

Abstract

A denoising autoencoder is applied to reverberant speech recognition as a noise-robust front-end that reconstructs the clean speech spectrum from noisy input. To capture context effects of speech sounds, a window of multiple short-windowed spectral frames is concatenated to form a single input vector. Additionally, a combination of short- and long-term spectra is investigated to properly handle the long impulse response of reverberation while keeping the time resolution necessary for speech recognition. Experiments are performed using the CENSREC-4 dataset, which is designed as an evaluation framework for distant-talking speech recognition. Experimental results show that the proposed denoising autoencoder based front-end using the short-windowed spectra gives better results than conventional methods. Combining it with the long-term spectra yields further improvement. The recognition accuracy of the proposed method using the short- and long-term spectra is 97.0% for the open-condition test set of the dataset, whereas it is 87.8% when a baseline based on multi-condition training is used. As a supplemental experiment, large-vocabulary speech recognition is also performed, and the effectiveness of the proposed method is confirmed.
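The frame concatenation described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `splice_frames`, the symmetric window of `context` frames on each side, and the edge-frame padding are all assumptions made for illustration, since the abstract does not specify the window size or how utterance boundaries are handled.

```python
import numpy as np

def splice_frames(spec, context):
    """Concatenate each frame with `context` neighbours on each side.

    spec: array of shape (num_frames, num_bins), one spectral frame per row.
    Returns an array of shape (num_frames, (2 * context + 1) * num_bins).
    """
    num_frames, _ = spec.shape
    # Assumption: repeat the first/last frame so every frame has a full window.
    padded = np.pad(spec, ((context, context), (0, 0)), mode="edge")
    # Each slice is the input shifted by one frame; stacking them side by
    # side gives, for frame t, the frames t-context .. t+context.
    windows = [padded[i:i + num_frames] for i in range(2 * context + 1)]
    return np.concatenate(windows, axis=1)

# Toy "spectrogram": 4 frames with 3 bins each.
spec = np.arange(12, dtype=float).reshape(4, 3)
spliced = splice_frames(spec, context=1)
```

Each row of `spliced` would then serve as one input vector to the autoencoder, with the corresponding clean frame (or window) as the reconstruction target.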


Similar resources

Deep neural network-based bottleneck feature and denoising autoencoder-based dereverberation for distant-talking speaker identification

Deep neural network (DNN)-based approaches have been shown to be effective in many automatic speech recognition systems. However, few works have focused on DNNs for distant-talking speaker recognition. In this study, a bottleneck feature derived from a DNN and a cepstral domain denoising autoencoder (DAE)-based dereverberation are presented for distant-talking speaker identification, and a comb...


Environment-dependent denoising autoencoder for distant-talking speech recognition

In this paper, we propose an environment-dependent denoising autoencoder (DAE) and automatic environment identification based on a deep neural network (DNN) with blind reverberation estimation for robust distant-talking speech recognition. Recently, DAEs have been shown to be effective in many noise reduction and reverberation suppression applications because higher-level representations and in...


Three ways to adapt a CTS recognizer to unseen reverberated speech in BUT system for the ASpIRE challenge

This paper describes several strategies tested in BUT’s submission to the IARPA ASpIRE challenge. The ASpIRE task was to develop an automatic speech recognition (ASR) system for wide-band noisy reverberant speech, while only clean CTS (Fisher) data was allowed for ASR training. To solve this task, we have started with augmenting Fisher data with artificially noised and reverberated versions. Th...


Reverberant speech recognition combining deep neural networks and deep autoencoders augmented with a phone-class feature

We propose an approach to reverberant speech recognition adopting deep learning in the front-end as well as back-end of a reverberant speech recognition system, and a novel method to improve the dereverberation performance of the front-end network using phone-class information. At the front-end, we adopt a deep autoencoder (DAE) for enhancing the speech feature parameters, and speech recognitio...


Reverberant Speech Recognition Combining Deep Neural Networks and Deep Autoencoders

We propose an approach to reverberant speech recognition adopting deep learning in front end as well as back end of the system. At the front end, we adopt a deep autoencoder for enhancing the speech feature parameters, and the recognition is performed using a DNN-HMM acoustic models trained on multi-condition data. The system was evaluated through the ASR task in Chime Challenge 2014. The DNN-H...




Publication date: 2013
